
Conversation

@e06084 (Collaborator) commented Dec 11, 2025

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @e06084, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves issues in the RAG (Retrieval Augmented Generation) examples by adapting them to recent API changes in the evaluation framework. The updates ensure that evaluation results are correctly processed and displayed, and enhance file path handling for test datasets, making the examples functional and aligned with the current library specifications.

Highlights

  • RAG Evaluation API Update: The RAG evaluation examples have been updated to reflect changes in the underlying evaluation library's API. Specifically, the eval_details and eval_status attributes are no longer used; the code now accesses the result object directly or reads its status attribute (a minimal sketch follows these highlights).
  • Improved Path Handling: The CSV_FILE_PATH in sdk_rag_eval_batch_dataset.py now utilizes pathlib.Path for more robust and object-oriented file path management, and points to a new location within a test/data directory.
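
To make the first highlight concrete, here is a minimal before/after sketch of the migration. Only the status and score attributes, and the pass-when-falsy convention, are taken from the diffs in this PR; the EvalResult stub below is an illustrative stand-in, not the library's actual result class.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    """Stand-in for the library's result type (illustrative only)."""
    status: bool   # falsy means the metric passed, per the example code below
    score: float

def report(result: EvalResult) -> None:
    # Old API (no longer used by the examples):
    #   passed = not result.eval_status
    #   details = result.eval_details
    # New API: read `status` and use the result object itself.
    print(f"Status: {'✅ Pass' if not result.status else '❌ Fail'}")
    print(f"Score: {result.score}/10")

report(EvalResult(status=False, score=8.5))
```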

@gemini-code-assist bot left a comment

Code Review

This pull request fixes the RAG examples by updating them to use the new API for evaluation results, where eval_status is renamed to status and eval_details is removed in favor of using the result object directly. The changes are correct and consistent across all example files.

I've added a few suggestions for examples/rag/sdk_rag_eval_batch_dataset.py to improve maintainability by addressing code duplication, a misleading variable name, and inconsistent logging. These changes would make the example code cleaner and easier to understand.


  # Input file path configuration
- CSV_FILE_PATH = "ragflow_eval_data_50.jsonl"  # supports CSV and JSONL formats
+ CSV_FILE_PATH = Path("test/data/ragflow_eval_data_50.jsonl")  # supports CSV and JSONL formats
Severity: medium

The variable name CSV_FILE_PATH is misleading as the comment and the code logic indicate that it can also be a .jsonl file. To improve clarity, consider renaming it to something more generic like INPUT_FILE_PATH. Note that this will require updating its usages in the main function as well.

Suggested change
- CSV_FILE_PATH = Path("test/data/ragflow_eval_data_50.jsonl")  # supports CSV and JSONL formats
+ INPUT_FILE_PATH = Path("test/data/ragflow_eval_data_50.jsonl")  # supports CSV and JSONL formats
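
Beyond the rename, the "supports CSV and JSONL formats" comment implies the example dispatches on the file suffix. The loading code itself is not part of this PR, so the following is only a self-contained illustration of that pattern, not the repository's implementation:

```python
import csv
import json
from pathlib import Path

def load_dataset(path: Path) -> list[dict]:
    """Illustrative loader: pick a parser based on the file suffix."""
    if path.suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    if path.suffix == ".jsonl":
        with path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    raise ValueError(f"Unsupported file format: {path.suffix}")
```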

Comment on lines 128 to 159
print("\n1. 忠实度 (Faithfulness):")
faithfulness_result = LLMRAGFaithfulness.eval(data)
print(f" 状态: {'✅ 通过' if not faithfulness_result.eval_status else '❌ 未通过'}")
print(f" 状态: {'✅ 通过' if not faithfulness_result.status else '❌ 未通过'}")
print(f" 分数: {faithfulness_result.score}/10")
total_faithfulness += faithfulness_result.score

logger.info("\n2. 上下文精度 (Context Precision):")
print("\n2. 上下文精度 (Context Precision):")
precision_result = LLMRAGContextPrecision.eval(data)
logger.info(f" 状态: {'✅ 通过' if not precision_result.eval_status else '❌ 未通过'}")
logger.info(f" 状态: {'✅ 通过' if not precision_result.status else '❌ 未通过'}")
logger.info(f" 分数: {precision_result.score}/10")
print(f" 状态: {'✅ 通过' if not precision_result.eval_status else '❌ 未通过'}")
print(f" 状态: {'✅ 通过' if not precision_result.status else '❌ 未通过'}")
print(f" 分数: {precision_result.score}/10")
total_precision += precision_result.score

print("\n3. 上下文召回 (Context Recall):")
recall_result = LLMRAGContextRecall.eval(data)
print(f" 状态: {'✅ 通过' if not recall_result.eval_status else '❌ 未通过'}")
print(f" 状态: {'✅ 通过' if not recall_result.status else '❌ 未通过'}")
print(f" 分数: {recall_result.score}/10")
total_recall += recall_result.score

print("\n4. 上下文相关性 (Context Relevancy):")
relevancy_result = LLMRAGContextRelevancy.eval(data)
print(f" 状态: {'✅ 通过' if not relevancy_result.eval_status else '❌ 未通过'}")
print(f" 状态: {'✅ 通过' if not relevancy_result.status else '❌ 未通过'}")
print(f" 分数: {relevancy_result.score}/10")
total_relevancy += relevancy_result.score
#
print("\n5. 答案相关性 (Answer Relevancy):")
answer_relevancy_result = LLMRAGAnswerRelevancy.eval(data)
print(f" 状态: {'✅ 通过' if not answer_relevancy_result.eval_status else '❌ 未通过'}")
print(f" 状态: {'✅ 通过' if not answer_relevancy_result.status else '❌ 未通过'}")
print(f" 分数: {answer_relevancy_result.score}/10")
total_answer_relevancy += answer_relevancy_result.score
Severity: medium

This block of code for evaluating metrics is almost identical to the one in the evaluate_from_csv function (lines 271-302). This code duplication makes the code harder to maintain, as any change in the evaluation logic needs to be applied in two places. Consider extracting this logic into a separate helper function to improve maintainability and reduce redundancy.
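
Here is a sketch of the extraction this comment suggests, reusing only the names visible in this PR's diffs (eval(data), status, score); the helper itself is illustrative rather than code from the repository:

```python
def run_metric(label: str, metric, data) -> float:
    """Evaluate one RAG metric, print its status and score, and return the score."""
    print(f"\n{label}:")
    result = metric.eval(data)
    print(f" Status: {'✅ Pass' if not result.status else '❌ Fail'}")
    print(f" Score: {result.score}/10")
    return result.score

# Usage inside the per-sample loop, replacing each duplicated block:
# total_faithfulness += run_metric("1. Faithfulness", LLMRAGFaithfulness, data)
# total_precision += run_metric("2. Context Precision", LLMRAGContextPrecision, data)
```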

Comment on lines 134 to 141
logger.info("\n2. 上下文精度 (Context Precision):")
print("\n2. 上下文精度 (Context Precision):")
precision_result = LLMRAGContextPrecision.eval(data)
logger.info(f" 状态: {'✅ 通过' if not precision_result.eval_status else '❌ 未通过'}")
logger.info(f" 状态: {'✅ 通过' if not precision_result.status else '❌ 未通过'}")
logger.info(f" 分数: {precision_result.score}/10")
print(f" 状态: {'✅ 通过' if not precision_result.eval_status else '❌ 未通过'}")
print(f" 状态: {'✅ 通过' if not precision_result.status else '❌ 未通过'}")
print(f" 分数: {precision_result.score}/10")
total_precision += precision_result.score
Severity: medium

The logging for 'Context Precision' is inconsistent with other metrics in this loop. It logs to both the logger and stdout, and the calls are duplicated, whereas other metrics only print to stdout. For consistency and to remove duplication, I suggest using only print here, similar to the other metrics.

Suggested change
- logger.info("\n2. Context Precision:")
- print("\n2. Context Precision:")
- precision_result = LLMRAGContextPrecision.eval(data)
- logger.info(f" Status: {'✅ Pass' if not precision_result.status else '❌ Fail'}")
- logger.info(f" Score: {precision_result.score}/10")
- print(f" Status: {'✅ Pass' if not precision_result.status else '❌ Fail'}")
- print(f" Score: {precision_result.score}/10")
- total_precision += precision_result.score
+ print("\n2. Context Precision:")
+ precision_result = LLMRAGContextPrecision.eval(data)
+ print(f" Status: {'✅ Pass' if not precision_result.status else '❌ Fail'}")
+ print(f" Score: {precision_result.score}/10")
+ total_precision += precision_result.score

@e06084 merged commit 153a19d into MigoXLab:dev Dec 11, 2025
2 checks passed